<html>
<head>
  <title>METviewer Documentation</title>
  <link rel="stylesheet" type="text/css" href="mv_doc.css"/>
  <link rel="shortcut icon" href="include/ral_icon.ico" type="image/x-icon"/>
</head>
<body>

<p class="loc" style="padding-top:10px">
  <b>Location:</b> <a class="loc" href="index.html">Home</a> &#187; Database scrubbing
</p>
<hr/>

<h2>METviewer Documentation - Database Scrubbing Module</h2>

<p>The database scrubbing utility is used to to delete data from METviewer databases that meets
  some user-specified selection criteria. The usage statement:</p>

<p class="term">
  ---- Database Scrubbing ----<br/><br/>
  Usage: mv_prune.sh prune_db_spec_file <br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;where prune_db_spec_file specifies the XML pruning specification document<br/>

  ---- Database Scrubbing Done ----<br/>
</p>

The <span class="term">prune_db_spec_file</span> passed to the prunning module contains information about the criteria for deleting data. It is an XML file whose
top-level tag is <span class="code">&lt;prune_spec&gt;</prune_spec></span> which contains the elements described below, divided into functional sections.  The data to be deleted may be specified in one of three ways: as a list of file names, as a list of directories, or as a set of values for various fields in the data.


<ul>
  <li><span class="code">&lt;connection&gt;</span> see <a href="common_xml.html">common xml</a></li>
  <li><span class="code">&lt;info_only&gt;</span>: <span class="code">true</span> or <span class="code">false</span>, indicates if the data to be deleted should only be listed (<span class="code">true</span>) or the actual deletion performed (<span class="code">false</span>)</li>

  <li>
    <span class="code">&lt;fields&gt;</span>: a list of fields used for pruning
    <ul>
      <li><span class="code">&lt;value_range&gt;</span> an inclusive range for continuous variables, such as dates
        <ul>
          <li><span class="code">&lt;start&gt;</span> beginning of the range
          <li><span class="code">&lt;end&gt;</span> end of the range
        </ul>
      </li>
      OR
      <li><span class="code">&lt;value_list&gt;</span> a list of values, such as models
        <ul>
          <li><span class="code">&lt;value&gt;</span> single value
        </ul>
      </li>
    </ul>
  </li>

  OR

  <li>
    <span class="code">&lt;files&gt;</span>: a set of files to be removed
    <ul>
      <li><span class="code">&lt;file&gt;</span> File name
    </ul>
  </li>

  OR
  <li>
    <span class="code">&lt;folders&gt;</span>: a set of directories to be removed
    <ul>
      <li><span class="code">&lt;folder_tmpl&gt;</span>: a template string describing the file structure of the MET files, which is populated with values
        specified in the
        <span class="code">&lt;load_val&gt;</span> tag structure
      </li>
      <li><span class="code">&lt;date_list&gt;</span> see <a href="common_xml.html">common xml</a></li>
      <br/>
      <li>
        <span class="code">&lt;load_val&gt;</span>: a tree structure containing values used to populate the <span class="code">&lt;folder_tmpl&gt;</span>
        template
        <ul>
          <li><span class="code">&lt;field&gt;</span>: a template value, whose name is specified by the attribute name, and whose values are specified by its
            children
            <span class="code">&lt;val&gt;</span> tags
          </li>
          <ul>
            <li><span class="code">&lt;val&gt;</span>: a single template value which will slot into the template in the value specified by the parent field's
              name
            </li>
            <li><span class="code">&lt;date_list&gt;</span>: specifies a previously declared <span class="code">&lt;date_list&gt;</span> element, using the name
              attribute,
              which represents a list of dates in a particular format
            </li>
          </ul>
        </ul>
      </li>
    </ul>
  </li>
</ul>


<p><span class="code">Example 1: Prune by describing the data (&ltfields&gt)</span></p>

The following configuration will remove data from database 'mv_database' that has model names NAM and GFS, forecast variable APCP_03<br/> and forecast valid dates
between 2013-07-05 06:00 and 2013-07-05 18:00:00<br/>
<p class="code">
  &lt;prune_spec&gt;<br/>
  &nbsp;&nbsp;&lt;connection&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;host&gt;<span class="term">db_host:3306</span>&lt;/host&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;database&gt;<span class="term">mv_database</span>&lt;/database&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;user&gt;user_name&lt;<span class="term">user</span>&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;password&gt;user_password&lt;<span class="term">password</span>&gt;<br/>
  &nbsp;&nbsp;&lt;/connection&gt;<br/>
  &nbsp;&nbsp;&lt;info_only&gt;<span class="term">false</span>&lt;/info_only&gt;<br/>
  <br/>
  &nbsp;&nbsp;&lt;fields&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;field name="<span class="term">fcst_valid_beg</span>"&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;value_range&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;start&gt;<span class="term">2013-07-05 06:00:00</span>&lt;/start&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;end&gt;<span class="term">2013-07-05 18:00:00</span>&lt;/end&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/value_range&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;/field&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;field name="<span class="term">model</span>"&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;value_list&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt;<span class="term">NAM</span>&lt;/value&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt;<span class="term">GFS</span>&lt;/value&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/value_list&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;/field&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;field name="<span class="term">fcst_var</span>"&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;value_list&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;value&gt;<span class="term">APCP_03</span>&lt;/value&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/value_list&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;/field&gt;<br/>
  &nbsp;&nbsp;&lt;/fields&gt;<br/>
  &lt;/prune_spec&gt;<br/>
</p>

<p><span class="code">Example 2: Prune by a list of files (&lt;files&gt;) and by a list of directories (&lt;folders&gt;)</span></p>

This configuration will remove data from database 'mv_database' that was loaded from the following files:<br/>
<p class="term" style="padding-left: 40px">
/d3/metprd/grid_stat/grid_stat_APCP_03_030000L_20130705_030000V.stat<br/>
/d3/metprd/mode/mode_APCP_06_180000L_20130705_180000V_060000A_obj.txt<br/>
</p>

And from the following directories:<br/>
<p class="term" style="padding-left: 40px">
  /d1/data/arw/FULL/2010051914<br/>
  /d1/data/arw/SWC/2010051914<br/>
  /d1/data/nmm/FULL/2010051914<br/>
  /d1/data/nmm/SWC/2010051914<br/>
  /d1/data/arw/FULL/2010051915<br/>
  /d1/data/arw/SWC/2010051915<br/>
  /d1/data/nmm/FULL/2010051915<br/>
  /d1/data/nmm/SWC/2010051915<br/>
</p>

<p class="code">
  &lt;prune_spec&gt;<br/>
  &nbsp;&nbsp;&lt;connection&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;host&gt;<span class="term">db_host:3306&lt;</span>/host&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;database&gt;<span class="term">mv_database</span>&lt;/database&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;user&gt;<span class="term">user_name</span>&lt;/user&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;password&gt;<span class="term">user_password</span>&lt;/password&gt;<br/>
  &nbsp;&nbsp;&lt;/connection&gt;<br/>
  &nbsp;&nbsp;&lt;info_only&gt;<span class="term">false</span>&lt;/info_only&gt;<br/>
  <br/>
  &nbsp;&nbsp;&lt;files&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;file&gt;<span class="term">/d3/metprd/grid_stat/grid_stat_APCP_03_030000L_20130705_030000V.stat</span>&lt;/file&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;file&gt;<span class="term">/d3/metprd/grid_stat/grid_stat_APCP_03_030000L_20130705_030000V.stat</span>&lt;/file&gt;<br/>

  &nbsp;&nbsp;&lt;/files&gt;<br/>
  &nbsp;&nbsp;&lt;folders&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;date_list name="<span class="term">folder_dates</span>"&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;start&gt;<span class="term">2010051914</span>&lt;/start&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;end&gt;<span class="term">2010051915</span>&lt;/end&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;inc&gt;<span class="term">3600</span>&lt;/inc&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;format&gt;<span class="term">yyyyMMddHH</span>&lt;/format&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;/date_list&gt;<br/><br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;folder_tmpl&gt;<span class="term">/d1/data/{model}/{vx_mask}/{valid_time}</span>&lt;/folder_tmpl&gt;<br/><br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;load_val&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;field name="<span class="term">model</span>"&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;val&gt;<span class="term">arw</span>&lt;/val&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;val&gt;<span class="term">nmm</span>&lt;/val&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/field&gt;<br/><br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;field name="<span class="term">valid_time</span>"&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;date_list name="<span class="term">folder_dates</span>"/&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/field&gt;<br/><br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;field name="<span class="term">vx_mask</span>"&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;val&gt;<span class="term">FULL</span>&lt;/val&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;val&gt;<span class="term">SWC</span>&lt;/val&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;/field&gt;<br/>
  &nbsp;&nbsp;&nbsp;&nbsp;&lt;/load_val&gt;<br/><br/>

  &nbsp;&nbsp;&lt;/folders&gt;<br/>

  &lt;/prune_spec&gt;<br/>

</p>
</body>
</html>
